7 research outputs found
Modified Gravity with Torsion
We are at a point in time when alternative gravitational theories are beginning to be constrained by high-precision cosmological and astrophysical data. The work in this thesis focuses on applications of a particular modified gravity theory, the extended Weyl Gauge Theory (eWGT), that was recently developed by Prof. Lasenby and Prof. Hobson. The applications correspond to subsets of the full theory that are applicable to different cosmological and astrophysical sectors.
We start by investigating a simplified scenario that simulates a famous alternative theory of gravity, Weyl Gravity. Recently, a couple of issues have been raised regarding the validity of the theory. Starting from a gauge theory perspective, we bring a fresh contribution to the debate. We argue against the classical formulation by showing that the theory cannot support astrophysical matter (introduced by a perfect fluid). Furthermore, we extend the theory and show that, even if we allow torsion to be present, we cannot reach a physical setup. In the process, we discovered interesting properties of the torsion field that could play an important role in generalised cosmological setups.
In the next application we consider a cosmology dictated by a Riemann Lagrangian that can accommodate only radiation. We find new physical behaviour in the perturbed regime that discretises the power spectrum. We prove that the setup admits gravitational waves.
Finally, we construct a new Lagrangian theory for spinning fluids. We show that it is compatible with current literature for a flat spacetime. Given its extensibility, we believe that it can be widely used in future research.
Demonstration of CORNET: A System For Learning Spreadsheet Formatting Rules By Example
Data management and analysis tasks are often carried out using spreadsheet
software. A popular feature in most spreadsheet platforms is the ability to
define data-dependent formatting rules. These rules can express actions such as
"color red all entries in a column that are negative" or "bold all rows not
containing error or failure." Unfortunately, users who want to exercise this
functionality need to manually write these conditional formatting (CF) rules.
We introduce CORNET, a system that automatically learns such conditional
formatting rules from user examples. CORNET takes inspiration from inductive
program synthesis and combines symbolic rule enumeration, based on
semi-supervised clustering and iterative decision tree learning, with a neural
ranker to produce accurate conditional formatting rules. In this demonstration,
we show CORNET in action as a simple add-in to Microsoft Excel. After the user
provides one or two formatted cells as examples, CORNET generates formatting
rule suggestions for the user to apply to the spreadsheet. Comment: 4 pages, VLDB 2023 Demonstration Track
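To make the idea of learning a formatting rule from examples concrete, here is a minimal Python sketch. It is not CORNET's implementation: the candidate enumeration is a hypothetical set of simple numeric threshold predicates, and a hand-written heuristic (prefer rules that format the fewest cells) stands in for the neural ranker described above. The names `candidate_rules` and `suggest_rules` are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

# A rule is a human-readable name paired with a predicate over a cell value.
Rule = Tuple[str, Callable[[float], bool]]


def candidate_rules(column: List[float]) -> List[Rule]:
    """Enumerate simple threshold predicates (hypothetical rule space)."""
    rules: List[Rule] = [
        ("value < 0", lambda v: v < 0),
        ("value > 0", lambda v: v > 0),
    ]
    for t in sorted(set(column)):
        # Bind t as a default argument so each lambda keeps its own threshold.
        rules.append((f"value <= {t}", lambda v, t=t: v <= t))
        rules.append((f"value >= {t}", lambda v, t=t: v >= t))
    return rules


def suggest_rules(column: List[float], formatted_idx: List[int], top_k: int = 3) -> List[str]:
    """Return up to top_k rule names consistent with the user's formatted cells."""
    examples = [column[i] for i in formatted_idx]
    # Keep only rules that hold for every cell the user formatted.
    consistent = [
        (name, pred)
        for name, pred in candidate_rules(column)
        if all(pred(v) for v in examples)
    ]
    # Heuristic stand-in for a learned ranker: prefer rules that format
    # the fewest cells; ties keep enumeration order (sorted is stable).
    ranked = sorted(consistent, key=lambda rule: sum(rule[1](v) for v in column))
    return [name for name, _ in ranked[:top_k]]


column = [5, -3, 12, -7, 0, 8]
suggestions = suggest_rules(column, formatted_idx=[1, 3])  # user formatted -3 and -7
print(suggestions)
```

With two negative cells given as examples, the tightest consistent rules are ranked first. A real system would need a far richer rule space (text, dates, row-level conditions) and a ranker trained on actual user rules.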
DataVinci: Learning Syntactic and Semantic String Repairs
String data is common in real-world datasets: 67.6% of values in a sample of
1.8 million real Excel spreadsheets from the web were represented as text.
Systems that successfully clean such string data can have a significant impact
on real users. While prior work has explored errors in string data, proposed
approaches have often been limited to error detection or require that the user
provide annotations, examples, or constraints to fix the errors. Furthermore,
these systems have focused independently on syntactic errors or semantic errors
in strings, but ignore that strings often contain both syntactic and semantic
substrings. We introduce DataVinci, a fully unsupervised string data error
detection and repair system. DataVinci learns regular-expression-based patterns
that cover a majority of values in a column and reports values that do not
satisfy such patterns as data errors. DataVinci can automatically derive edits
to the data error based on the majority patterns and constraints learned over
other columns without the need for further user interaction. To handle strings
with both syntactic and semantic substrings, DataVinci uses an LLM to abstract
(and re-concretize) portions of strings that are semantic prior to learning
majority patterns and deriving edits. Because not all data can result in
majority patterns, DataVinci leverages execution information from an existing
program (which reads the target data) to identify and correct data repairs that
would not otherwise be identified. DataVinci outperforms 7 baselines on both
error detection and repair when evaluated on 4 existing and new benchmarks. Comment: 13 pages
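The majority-pattern idea described above can be illustrated with a short Python sketch. This is a simplified assumption-laden illustration, not DataVinci's algorithm: it abstracts each value into a coarse regular expression (runs of ASCII digits, runs of ASCII letters, literal other characters), takes the most frequent pattern in the column, and flags values that do not match it. The function names and the `min_support` threshold are hypothetical, and the sketch covers detection only, with no LLM-based semantic abstraction or repair.

```python
import re
from collections import Counter
from typing import List, Optional, Tuple

# Tokenizer: a run of digits, a run of letters, or any single other character.
TOKEN = re.compile(r"(?P<digits>[0-9]+)|(?P<letters>[A-Za-z]+)|(?P<other>.)", re.DOTALL)


def abstract_pattern(value: str) -> str:
    """Map a string to a coarse regex describing its shape."""
    parts = []
    for match in TOKEN.finditer(value):
        if match.lastgroup == "digits":
            parts.append("[0-9]+")
        elif match.lastgroup == "letters":
            parts.append("[A-Za-z]+")
        else:
            # Punctuation, whitespace, etc. are kept as escaped literals.
            parts.append(re.escape(match.group()))
    return "".join(parts)


def detect_errors(values: List[str], min_support: float = 0.5) -> Tuple[Optional[str], List[str]]:
    """Return (majority pattern, values violating it), or (None, []) if no majority exists."""
    if not values:
        return None, []
    counts = Counter(abstract_pattern(v) for v in values)
    pattern, support = counts.most_common(1)[0]
    # Require the pattern to cover strictly more than min_support of the column.
    if support / len(values) <= min_support:
        return None, []
    regex = re.compile(pattern)
    errors = [v for v in values if not regex.fullmatch(v)]
    return pattern, errors


column = ["AB-123", "CD-456", "EF-789", "GH_012"]
pattern, errors = detect_errors(column)
print(pattern, errors)
```

Here three of the four values share the shape "letters, hyphen, digits", so the underscore variant is reported as a likely error. Columns without a dominant pattern return no errors, which corresponds to the case where the abstract says execution information from a downstream program is needed instead.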
"What It Wants Me To Say": Bridging the Abstraction Gap Between End-User Programmers and Code-Generating Large Language Models
Code-generating large language models translate natural language into code.
However, only a small portion of the infinite space of naturalistic utterances
is effective at guiding code generation. For non-expert end-user programmers,
learning this is the challenge of abstraction matching. We examine this
challenge in the specific context of data analysis in spreadsheets, in a system
that maps the user's natural language query to Python code using the Codex
generator, executes the code, and shows the result. We propose grounded
abstraction matching, which bridges the abstraction gap by translating the code
back into a systematic and predictable naturalistic utterance. In a
between-subjects, think-aloud study (n=24), we compare grounded abstraction
matching to an ungrounded alternative based on previously established query
framing principles. We find that the grounded approach improves end-users'
understanding of the scope and capabilities of the code-generating model, and
the kind of language needed to use it effectively.
InstructExcel: A Benchmark for Natural Language Instruction in Excel
With the evolution of Large Language Models (LLMs), we can solve increasingly
complex NLP tasks across various domains, including spreadsheets. This
work investigates whether LLMs can generate code (Excel OfficeScripts, a
TypeScript API for executing many tasks in Excel) that solves Excel-specific
tasks provided via natural language user instructions. To do so, we introduce a
new large-scale benchmark, InstructExcel, created by leveraging the 'Automate'
feature in Excel to automatically generate OfficeScripts from users' actions.
Our benchmark includes over 10k samples covering 170+ Excel operations across
2,000 publicly available Excel spreadsheets. Experiments across various
zero-shot and few-shot settings show that InstructExcel is a hard benchmark for
state-of-the-art models like GPT-4. We observe that (1) using GPT-4 over
GPT-3.5, (2) providing more in-context examples, and (3) dynamic prompting can
help improve performance on this benchmark. Comment: Findings of EMNLP 2023, 18 pages